79 research outputs found
Automatic Induction of Classification Rules from Examples Using N-Prism
www.dis.port.ac.uk/~bramerma One of the key technologies of data mining is the automatic induction of rules from examples, particularly the induction of classification rules. Most work in this field has concentrated on the generation of such rules in the intermediate form of decision trees. An alternative approach is to generate modular classification rules directly from the examples. This paper seeks to establish a revised form of the rule generation algorithm Prism as a credible candidate for use in the automatic induction of classification rules from examples in practical domains where noise may be present and where predicting the classification for previously unseen instances is the primary focus of attention
A scalable expressive ensemble learning using Random Prism: a MapReduce approach
The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers which are affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism also does not scale well on large training data. This paper presents a thorough discussion of Random Prism and a recently proposed parallel version of it called Parallel Random Prism. Parallel Random Prism is based on the MapReduce programming paradigm. The paper provides, for the first time, novel theoretical analysis of the proposed technique and in-depth experimental study that show that Parallel Random Prism scales well on a large number of training examples, a large number of data features and a large number of processors. Expressiveness of decision rules that our technique produces makes it a natural choice for Big Data applications where informed decision making increases the user's trust in the system
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research investigation, in the area, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks
The global epidemiology of hepatitis E virus infection: A systematic review and meta-analysis
Background and aims: Hepatitis E virus (HEV), as an emerging zoonotic pathogen, is a leading cause of acute viral hepatitis worldwide, with a high risk of developing chronic infection in immunocompromised patients. However, the global epidemiology of HEV infection has not been comprehensively assessed. This study aims to map the global prevalence and identify the risk factors of HEV infection by performing a systematic review and meta-analysis. Methods: A systematic search of articles published in Medline, Embase, Web of Science, Cochrane and Google Scholar databases until July 2019 was conducted to identify studies with HEV prevalence data. Pooled prevalence among different countries and continents was estimated. HEV IgG seroprevalence of subgroups was compared and risk factors for HEV infection were evaluated using odds ratios (OR). Results: We identified 419 related studies comprising 1 519 872 individuals. A total of 1 099 717 participants pooled from 287 studies of the general population gave an estimated global anti-HEV IgG seroprevalence of 12.47% (95% CI 10.42-14.67; I² = 100%). Notably, the use of ELISA kits from different manufacturers has a substantial impact on the global estimation of anti-HEV IgG seroprevalence. The pooled estimate of anti-HEV IgM seroprevalence based on 98 studies is 1.47% (95% CI 1.14-1.85; I² = 99%). The overall estimate of the HEV viral RNA-positive rate in the general population is 0.20% (95% CI 0.15-0.25; I² = 98%). Consumption of raw meat (P = .0001), exposure to soil (P < .0001), blood transfusion (P = .0138), travelling to endemic areas (P = .0244), contact with dogs (P = .0416), living in rural areas (P = .0349) and receiving education less than elementary school (P < .0001) were identified as risk factors for anti-HEV IgG positivity. Conclusions: Globally, approximately 939 million individuals, corresponding to 1 in 8, have ever experienced HEV infection. 15-110 million individuals have recent or ongoing HEV infection. Our study highlights the substantial burden of HEV infection and calls for increasing routine screening and preventive measures
Direct-acting antiviral agents for liver transplant recipients with recurrent genotype 1 hepatitis C virus infection: Systematic review and meta-analysis
Background: Comprehensive evaluation of safety and efficacy of different combinations of direct-acting antivirals (DAAs) in liver transplant recipients with genotype 1 (GT1) hepatitis C virus (HCV) recurrence remains limited. Therefore, we performed this systematic review and meta-analysis in order to evaluate the clinical outcome of DAA treatment in liver transplant patients with HCV GT1 recurrence. Methods: Studies were included if they contained information on 12-week sustained virologic response (SVR12) after DAA treatment completion as well as treatment-related complications for liver transplant recipients with GT1 HCV recurrence. Results: We identified 16 studies comprising 885 patients. The overall pooled estimate proportion of SVR12 was 93% (95% confidence interval (CI): 0.89, 0.96), with moderate heterogeneity observed (τ² = 0.01, P < 0.01, I² = 75%). High tolerability was observed in liver transplant recipients, reflected by serious adverse events (sAEs) with a pooled estimate proportion of 4% (95% CI: 0.01, 0.07; τ² = 0.02, P < 0.01, I² = 81%). For subgroup analysis, a total of five different DAA regimens were applied for treating these patients. Sofosbuvir/Ledipasvir (SOF/LDV) yielded the highest pooled estimate SVR12 proportion, followed by Paritaprevir/Ritonavir/Ombitasvir/Dasabuvir (PrOD), Daclatasvir (DCV)/Simeprevir (SMV) ± Ribavirin (RBV), and SOF/SMV ± RBV, Asunaprevir (ASV)/DCV. There was a tendency toward a higher pooled SVR12 proportion in patients with METAVIR Stage F0-F2 of 97% (95% CI: 0.93, 0.99) compared to 85% (95% CI: 0.79, 0.90) for stage F3-F4 (P < 0.01). There was no significant difference between LT recipients treated with or without RBV (P = 0.23). Conclusions: Direct-acting antiviral treatment is highly effective and well-tolerated in liver transplant recipients with recurrent GT1 HCV infection
The barriers and facilitators influencing the sustainability of hospital-based interventions: a systematic review
Acknowledgements: University of Stirling for providing financial support for open access costs. Funding: This review was funded by the Chief Scientist Office, grant number GCA/17/26. JC, PC and EAD are employed by the Nursing, Midwifery and Allied Health Professions Research Unit, which is funded by the Chief Scientist Office in Scotland. Peer reviewed. Publisher PD
Tobacco use induces anti-apoptotic, proliferative patterns of gene expression in circulating leukocytes of Caucasian males
Background: Strong epidemiologic evidence correlates tobacco use with a variety of serious adverse health effects, but the biological mechanisms that produce these effects remain elusive. Results: We analyzed gene transcription data to identify expression spectra related to tobacco use in circulating leukocytes of 67 Caucasian male subjects. Levels of cotinine, a nicotine metabolite, were used as a surrogate marker for tobacco exposure. Significance Analysis of Microarray and Gene Set Analysis identified 109 genes in 16 gene sets whose transcription levels were differentially regulated by nicotine exposure. We subsequently analyzed this gene set by hyperclustering, a technique that allows the data to be clustered by both expression ratio and gene annotation (e.g. Gene Ontologies). Conclusion: Our results demonstrate that tobacco use affects transcription of groups of genes that are involved in proliferation and apoptosis in circulating leukocytes. These transcriptional effects include a repertoire of transcriptional changes likely to increase the incidence of neoplasia through an altered expression of genes associated with transcription and signaling, interferon responses and repression of apoptotic pathways
The global impact of non-communicable diseases on macro-economic productivity: a systematic review
© 2015, The Author(s). Non-communicable diseases (NCDs) have large economic impact at multiple levels. To systematically review the literature investigating the economic impact of NCDs [including coronary heart disease (CHD), stroke, type 2 diabetes mellitus (DM), cancer (lung, colon, cervical and breast), chronic obstructive pulmonary disease (COPD) and chronic kidney disease (CKD)] on macro-economic productivity. Systematic search, up to November 6th 2014, of medical databases (Medline, Embase and Google Scholar) without language restrictions. To identify additional publications, we searched the reference lists of retrieved studies and contacted authors in the field. Randomized controlled trials, cohort, caseâcontrol, cross-sectional, ecological studies and modelling studies carried out in adults (>18 years old) were included. Two independent reviewers performed all abstract and full text selection. Disagreements were resolved through consensus or consulting a third reviewer. Two independent reviewers extracted data using a predesigned data collection form. Main outcome measure was the impact of the selected NCDs on productivity, measured in DALYs, productivity costs, and labor market participation, including unemployment, return to work and sick leave. From 4542 references, 126 studies met the inclusion criteria, many of which focused on the impact of more than one NCD on productivity. Breast cancer was the most common (n = 45), followed by stroke (n = 31), COPD (n = 24), colon cancer (n = 24), DM (n = 22), lung cancer (n = 16), CVD (n = 15), cervical cancer (n = 7) and CKD (n = 2). Four studies were from the WHO African Region, 52 from the European Region, 53 from the Region of the Americas and 16 from the Western Pacific Region, one from the Eastern Mediterranean Region and none from South East Asia. We found large regional differences in DALYs attributable to NCDs but especially for cervical and lung cancer. 
Productivity losses in the USA ranged from 88 million US dollars (USD) for COPD to 20.9 billion USD for colon cancer. CHD costs the Australian economy 13.2 billion USD per year. People with DM, COPD and survivors of breast and especially lung cancer are at a higher risk of reduced labor market participation. Overall NCDs generate a large impact on macro-economic productivity in most WHO regions irrespective of continent and income. The absolute global impact in terms of dollars and DALYs remains an elusive challenge due to the wide heterogeneity in the included studies as well as limited information from low- and middle-income countries.WHO; NestleÂŽ Nutrition (Nestec Ltd.); Metagenics Inc.; and AX